There are good reasons for using sequential methods in some statistical
decision problems, but a stopping rule that is helpful for deciding whether
$\theta > 0$ or $\theta < 0$ may not be so good for estimating $\theta$. This paper considers the
construction  of  confidence  bounds  on  a  real  parameter  and  investigates  the 
relation  between  the  ordering  of  boundary  points  that  are  accessible  under  the 
stopping  rule  and  the  natural  ordering  of  the  parameter  space. 
1. INTRODUCTION


There are many ways of defining an ordering of probability distributions
so  that  large  values  of  the  parameter  which  labels  the  distributions  correspond 
to  large  values  of  the  random  variables  themselves.  For  example,  Lehmann 
(1955)  gives  a  comparison  of  several  definitions  and  an  application  to  the 
sequential  probability  ratio  test.  We  shall  be  mainly  concerned  with 
stochastic  ordering,  which  is  well-known  in  applied  probability,  and  the 
stronger  relation  of  ordering  by  monotone  likelihood  ratios  (m.l.r.).  The 
latter  is  familiar  in  the  theory  of  hypothesis  testing,  where  it  leads  to 
uniformly  most  powerful  one-sided  tests  for  fixed  samples  from  a  distribution 
in a 1-dimensional exponential family. A closely related application of m.l.r.
is  the  construction  of  uniformly  most  accurate  confidence  bounds  on  the  unknown 
parameter:  see  Lehmann's  book  (1959),  pages  78-80.  The  aim  of  this  paper  is  to 
investigate confidence bounds determined after a random stopping time. As we
shall  see,  the  above  optimality  property  of  one-sided  confidence  bounds  is 
usually  lost  when  we  allow  sequential  sampling.  However,  for  a  large  class  of 
stopping  rules,  we  can  define  an  ordering  of  the  boundary  points  so  that  the 
distributions  of  stopping  points  are  stochastically  ordered  with  respect  to  the 
parameter.  In  general,  this  weaker  ordering  relation  seems  to  be  the  best  that 
can  be  obtained,  which  underlines  the  need  for  caution  when  interpreting 
confidence  bounds  and  intervals  based  on  sequential  data. 

In a recent book, Siegmund (1985) illustrates the advantages and
disadvantages  of  sequential  methods.  For  hypothesis  testing,  we  can  achieve  a 
significance  level  and  power  comparable  to  fixed  sample  procedures  with  a 
substantial  reduction  in  expected  sample  size.  On  the  other  hand,  sequential 
sampling  often  leads  to  less  accurate  estimation.  In  the  design  of  clinical 
trials, estimation may be a secondary consideration: for ethical reasons, it is
important to reject an inferior treatment as soon as possible, rather than
using  it  on  a  larger  sample  of  patients  to  improve  the  estimates  of  its 
performance.  Previous  research  has  been  mainly  concerned  with  procedures  for 
comparing  different  treatments  and  stopping  rules  designed  to  control  the  error 
probabilities of decisions about their relative merits. More recently,
Siegmund and others have turned to questions of estimation from sequences of
observations produced by various stopping rules. In particular, they have made
substantial progress in constructing confidence intervals, in spite of the
difficult  probability  calculations  and  delicate  approximations  often  needed  to 
deal  with  curved  stopping  boundaries. 

The construction of confidence intervals after a stopping time is based on
standard methods. Consider first a random variable Z whose distribution
depends on a real parameter $\theta$ and write $\phi(z,\theta) = P_\theta(Z \ge z)$. In general, a
lower confidence bound for $\theta$ can be obtained by using the fact that

$P_\theta(\phi(Z,\theta) \ge \alpha) \ge 1-\alpha$,  (1)

for any fixed $\alpha$, $0 \le \alpha \le 1$. Let $\underline{\theta}(z) = \inf\{\theta: \phi(z,\theta) \ge \alpha\}$, so that
$\phi(z,\theta) \ge \alpha \Rightarrow \underline{\theta}(z) \le \theta$. It follows from this definition and (1) that

$P_\theta(\underline{\theta}(Z) \le \theta) \ge P_\theta(\phi(Z,\theta) \ge \alpha) \ge 1-\alpha$.  (2)

Thus, $\underline{\theta}(z)$ is a lower confidence bound for $\theta$, given an observed value z of the
random variable. Upper confidence bounds $\bar{\theta}(z)$ can be constructed similarly,
for the same confidence coefficient $1-\alpha$, and we then obtain confidence
intervals with coefficient $1-2\alpha$ from the inequality

$P_\theta(\underline{\theta}(Z) \le \theta \le \bar{\theta}(Z)) \ge 1 - P_\theta(\underline{\theta}(Z) > \theta) - P_\theta(\bar{\theta}(Z) < \theta)$.

This probability is at least $1-2\alpha$, because of (2) and a similar property of
the upper bound $\bar{\theta}(Z)$.


The  above  argument  extends  in  a  straightforward  manner  to  more  complicated 
sample  spaces  where  the  data  consists  of  stopped  sequences  of  observations. 


Let $x_1, x_2, \ldots$ be independent observations from the probability density

$f(x;\theta) = \exp\{\theta x - \psi(\theta)\}$,  (3)

with respect to a $\sigma$-finite measure $\mu$ on the real line. This represents a
1-dimensional exponential family of distributions with mean $\psi'(\theta)$ and variance
$\psi''(\theta) > 0$. The natural parameter space of the family is a real interval and we
suppose that the true value of $\theta$ in this interval is unknown. Let $S_n = \sum_{i=1}^{n} x_i$
for $n \ge 1$ and note that any stopped sequence of observations $(x_1, x_2, \ldots, x_n)$ is
represented by the sufficient statistic (n,s), where $S_n = s$. Suppose that the
stopping time is defined by splitting the (n,s) plane into continuation and
stopping regions. It is helpful to think of upper and lower stopping sets,
separated by the continuation region. Since $\psi''(\theta) > 0$, the mean $\psi'(\theta)$ is
increasing in $\theta$ and we can imagine that, roughly speaking, points (n,s) in the
upper stopping set favour higher values of $\theta$. However, we need a more precise
ordering relation on the points of the stopping region before we can justify
constructing confidence bounds on $\theta$.

The idea is to order all the accessible boundary points (n,s) in a
counter-clockwise sense around the continuation region, but this can be done in
several ways. The definition given in Section 4 of this paper has been used
previously by Siegmund (1978), in obtaining confidence intervals for a normal
mean, and also by Jennison and Turnbull (1983), for estimating a binomial
probability.  The  authors  of  the  second  paper  considered  a  different  ordering 
of  boundary  points  based  on  the  ratio  s/n,  the  maximum  likelihood  estimate  of 
the  unknown  probability,  but  this  produced  very  similar  results  in  the  cases 
they  computed.  Siegmund  (1985)  also  mentions  the  ratio  s/n  as  a  possible  basis 
for  ordering  normal  data. 

In general, we must determine a function $\phi(n,s,\theta)$ which represents the
probability of stopping the sequence of observations at a point "above" (n,s).
The function can then be used to find confidence bounds on $\theta$ in the same way as
$\phi(z,\theta)$. It is important to note that the construction described earlier does
not depend explicitly on any relation between the parameter $\theta$ and the ordering
of the sample space. In principle, any ordering of points (n,s) in the
stopping region can produce valid confidence bounds. However, it is possible
to ensure that the function $\phi(n,s,\theta)$ is increasing in $\theta$. This general property
does not seem to have been established previously, although it is obviously
useful in computing confidence bounds from the formula

$\underline{\theta}(n,s) = \inf\{\theta: \phi(n,s,\theta) \ge \alpha\}$.  (4)
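When the tail function is non-decreasing in the parameter, the infimum in (4) can be located by bisection. The following is a minimal sketch under that assumption; the function name `lower_confidence_bound`, the search interval and the tolerance are illustrative choices, not part of the paper. The example uses a fixed sample of n Bernoulli trials, all successes, parametrised by p rather than by $\theta$, so that the tail probability is simply $p^n$.

```python
from typing import Callable


def lower_confidence_bound(phi: Callable[[float], float], alpha: float,
                           lo: float, hi: float, tol: float = 1e-10) -> float:
    """Approximate inf{theta in [lo, hi]: phi(theta) >= alpha} by bisection.

    Assumes phi is non-decreasing in theta (hypothetical helper, not from the paper).
    """
    if phi(lo) >= alpha:
        return lo  # every parameter value in the interval is retained
    if phi(hi) < alpha:
        raise ValueError("phi never reaches alpha on [lo, hi]")
    # Invariant: phi(lo) < alpha <= phi(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) >= alpha:
            hi = mid
        else:
            lo = mid
    return hi


# Illustration: n Bernoulli trials, all successes, so P_p(X >= n) = p**n.
n = 10
alpha = 0.05
bound = lower_confidence_bound(lambda p: p ** n, alpha, 0.0, 1.0)
print(bound)
```

In this special case the bound has the closed form $\alpha^{1/n}$, which gives a simple check on the bisection.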

The  rest  of  this  paper  is  organized  as  follows.  The  next  section  gives  a 
simple  example  which  demonstrates  the  advantages  and  disadvantages  of 
sequential  procedures.  It  also  suggests  an  arbitrariness  in  the  construction 
of  confidence  bounds  which  is  not  easy  to  eliminate  entirely.  However,  we  can 
distinguish  between  bounds  that  are  formally  valid  and  more  sensible 
constructions  that  are  also  related  to  the  unknown  parameter.  Section  3  is 
concerned  with  partial  orderings  of  probability  distributions.  It  contains  a 
brief description of and comparison between stochastic ordering and ordering by
m.l.r. There is a new result about the most likely permutation of a number of
independent random variables from an ordered family of distributions. The
property that the most likely permutation corresponds to the ordering of the
family holds for m.l.r. but not, in general, for stochastic ordering: see
Proposition 4. The ordering of boundary points for the random walk $\{S_n, n \ge 1\}$
is considered in Section 4 and it is proved that, for a large class of stopping
times N, the distributions of the random point $(N, S_N)$ are stochastically
ordered with respect to $\theta$. This means that the function $\phi(n,s,\theta)$ is increasing
in $\theta$. The final section of the paper gives another illustration, based on a
simple acceptance/rejection scheme for diffusion processes with unknown drift
parameters. It shows that the intuitive counter-clockwise ordering of random
stopping points with respect to the drift can be destroyed by conditioning on
rejection. The paper concludes with some tentative remarks about the design of
stopping rules.

2.  EXAMPLE 


Consider independent Bernoulli trials with success probability p and let
$x_i = \pm 1$, according as the i-th trial results in success or failure, $i = 1, 2, \ldots$.
Thus, we have an exponential family of distributions which can be expressed in
the form (3) by writing $\theta = \frac{1}{2}\log(p/q)$, $\psi(\theta) = \log(e^{\theta} + e^{-\theta})$, where $q = 1-p$.
However, the usual notation for Bernoulli trials will be more convenient here.
Suppose that we must decide, after observing a number of trials, whether $p > \frac{1}{2}$
or not. Various stopping rules will be considered for the random walk $\{S_n\}$,
$S_n = \sum_{i=1}^{n} x_i$. In each case, the terminal decision at a point (n,s) with $S_n = s$ will
depend only on the sign of s: we conclude that $p > \frac{1}{2}$ if $s > 0$, that $p < \frac{1}{2}$ if
$s < 0$ and we choose either decision by tossing a coin if $s = 0$.

We now turn to the stopping rules. Rule 1 is to take 4 observations and
then reach a decision about p, according to the sign of $S_4$. Rule 2 is a
sequential modification of it: we observe $x_1$ and $x_2$ and stop if $S_2 = \pm 2$, but if
$S_2 = 0$ we take another 4 observations. This modification can be used repeatedly
to produce a series of rules. Rule k is specified as follows: observe the
sequence $\{S_1, S_2, \ldots\}$ and stop as soon as $S_{2n} = \pm 2$, but if $S_{2n} = 0$ for
$n = 1, 2, \ldots, k-1$, take 4 more observations and stop at $n = 2k+2$. The limiting
form of these rules, with $k = \infty$, can be regarded as a sequential probability
ratio test. As we shall see, Rule $\infty$ is an improvement on its predecessors both
with regard to expected sample size and with regard to error probability.

Let $m_k(p)$ denote the expected sample size for Rule k and let $\alpha_k(p)$ be the
probability of reaching a terminal decision that $p > \frac{1}{2}$. The corresponding
error probability is given by:

$e_k(p) = \alpha_k(p)$,  $0 \le p \le \frac{1}{2}$,
$e_k(p) = 1 - \alpha_k(p)$,  $\frac{1}{2} \le p \le 1$.  (5)

We now prove that, for all p,

$m_1(p) \ge m_2(p) \ge \ldots \ge m_\infty(p)$,  (6)

$e_1(p) \ge e_2(p) \ge \ldots \ge e_\infty(p)$.  (7)

Proof. Clearly, $m_1(p) = 4$ and $m_2(p) = 2 + 2pq\,m_1(p)$. It follows that
$m_2(p) \le m_1(p)$, with equality only if $p = \frac{1}{2}$. Then by considering $S_2$,
$m_{k+1}(p) = 2 + 2pq\,m_k(p)$ and an inductive argument shows that $m_{k+1}(p) \le m_k(p)$ for
all k. As $k \to \infty$, $m_k(p) \to m_\infty(p) = 2(p^2+q^2)^{-1}$.

The proof of (7) is similar, but we need to use (5). Note that
$\alpha_1(p) = p^2(1+2q)$ and $\alpha_{k+1}(p) = p^2 + 2pq\,\alpha_k(p)$ for $k = 1, 2, \ldots$. It is easy to show
by induction that $\alpha_{k+1}(p) \le \alpha_k(p)$ if $0 < p < \frac{1}{2}$ and $\alpha_{k+1}(p) \ge \alpha_k(p)$ if
$\frac{1}{2} < p < 1$. We also have $\alpha_k(p) = p$, whenever $p = 0$, $\frac{1}{2}$ or 1, so the relations
(7) follow immediately from (5). In fact, explicit formulae for $\alpha_k(p)$ can be
obtained and, in the limit, $\alpha_\infty(p) = p^2(p^2+q^2)^{-1}$.
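The recursions used in the proof are easy to iterate numerically. The sketch below is only an illustration of (6) and (7) on a few values of p; the function names are hypothetical, and the recursions $m_{k+1} = 2 + 2pq\,m_k$ and $\alpha_{k+1} = p^2 + 2pq\,\alpha_k$, with $m_1 = 4$ and $\alpha_1 = p^2(1+2q)$, are taken from the argument above.

```python
def rule_k_summaries(p: float, k_max: int):
    """Return lists [m_1..m_kmax] and [alpha_1..alpha_kmax] for Rules 1..k_max."""
    q = 1.0 - p
    m = [4.0]                          # Rule 1 always takes 4 observations
    a = [p * p * (1.0 + 2.0 * q)]      # P(decide p > 1/2) under Rule 1
    for _ in range(k_max - 1):
        m.append(2.0 + 2.0 * p * q * m[-1])
        a.append(p * p + 2.0 * p * q * a[-1])
    return m, a


def error_probability(p: float, alpha_k: float) -> float:
    """Error probability e_k(p) as defined in (5)."""
    return alpha_k if p <= 0.5 else 1.0 - alpha_k


def limiting_values(p: float):
    """Limits m_inf(p) and alpha_inf(p) for the sequential rule with k = infinity."""
    q = 1.0 - p
    return 2.0 / (p * p + q * q), p * p / (p * p + q * q)


for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    m, a = rule_k_summaries(p, 5)
    e = [error_probability(p, x) for x in a]
    print(p, [round(x, 4) for x in m], [round(x, 4) for x in e])
```

For each p, both printed sequences should be non-increasing in k, in line with (6) and (7).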

The inequalities (6) and (7) show that, so far as terminal decisions are
concerned, the performance of the stopping rules improves as k increases.
However, their relative merits for estimation are quite different. Consider
first the unbiased estimation of p. It turns out that there is just one
unbiased estimator $\hat{p}_k$, based on Rule k. In particular, $\hat{p}_1 = (S_4+4)/8$ and its
variance is $pq/4$. For $k \ge 2$, $\hat{p}_k = (S_2+2)/4$ is unbiased and this has variance
$pq/2$. Standard methods can be used to verify that $\hat{p}_k$ is the unique unbiased
estimator of p, but we shall omit the details. Thus, for Rules 2, 3, ..., the
minimum variance unbiased estimator of p depends only on the first two
Bernoulli trials. From this point of view, Rule 1 is preferable.

Now consider confidence bounds on p. Instead of making a comparison of
different rules, we shall restrict attention to Rule 2. The stopping region
consists of 7 points: (2,-2), (6,-4), (6,-2), (6,0), (6,2), (6,4), (2,2), and
let us label these 1, 2, ..., 7 in counter-clockwise order. Their labels here are
related to p in the following sense. Let $\phi(j,p)$ be the probability that the
random walk stops at a point whose label is at least j. Then $\phi(j,p)$ is
non-decreasing in p for $j = 1, 2, \ldots, 7$. This is a consequence of the general
result which will be proved in Section 4. However, there are several possible
orderings of the 7 points with this property. For example, it is not difficult
to verify that it also holds if the labels (1,2,3,4,5,6,7) are replaced by
(1,3,2,4,6,5,7), respectively. Another ordering of the stopping region that is
plausible from a different point of view is obtained on replacing the original
labels by (2,1,3,4,5,7,6). It is arguable that the point (6,4), representing 5
successes in 6 trials, indicates higher values of p than the point (2,2). To
fix ideas, suppose that we observe a sequence of Bernoulli trials which
terminates at the point (6,4). In order to construct a lower confidence bound
on p, we need to specify the set A of boundary points above the data. There
are $2^6 = 64$ possible definitions of A. It is easy to see that 32 of these
would lead to the trivial claim that $0 \le p \le 1$, but the others produce
confidence bounds that make more sense. For example, the 3 possible orderings
mentioned above would lead to different statements of the form: $\underline{p} \le p \le 1$, for
the same confidence coefficient.
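The monotonicity claims for the three labellings of the Rule 2 stopping region can be checked numerically. The sketch below is an illustration only: the stopping probabilities are written out directly from the description of Rule 2 (stop at n = 2 if $S_2 = \pm 2$, otherwise take 4 more observations), the helper names are hypothetical, and the check is made on a finite grid of p values, so it is a sanity check rather than a proof.

```python
# Stopping points of Rule 2, listed in counter-clockwise order (labels 1..7).
POINTS = [(2, -2), (6, -4), (6, -2), (6, 0), (6, 2), (6, 4), (2, 2)]


def stopping_probabilities(p: float):
    """Probability of stopping at each point of POINTS under Rule 2."""
    q = 1.0 - p
    cont = 2.0 * p * q  # probability that S_2 = 0, so sampling continues
    return [q ** 2,
            cont * q ** 4,
            cont * 4.0 * p * q ** 3,
            cont * 6.0 * p ** 2 * q ** 2,
            cont * 4.0 * p ** 3 * q,
            cont * p ** 4,
            p ** 2]


def tail_probabilities(p: float, labels):
    """phi(j, p) for j = 1..7: probability of stopping at a label >= j."""
    probs = stopping_probabilities(p)
    return [sum(pr for pr, lab in zip(probs, labels) if lab >= j)
            for j in range(1, 8)]


def is_monotone(labels, grid_size: int = 1000, tol: float = 1e-12) -> bool:
    """True if every phi(j, .) is non-decreasing on an equally spaced grid."""
    previous = tail_probabilities(0.0, labels)
    for i in range(1, grid_size + 1):
        current = tail_probabilities(i / grid_size, labels)
        if any(c < b - tol for b, c in zip(previous, current)):
            return False
        previous = current
    return True


ORDERINGS = {
    "counter-clockwise": (1, 2, 3, 4, 5, 6, 7),
    "first alternative": (1, 3, 2, 4, 6, 5, 7),
    "second alternative": (2, 1, 3, 4, 5, 7, 6),
}
for name, labels in ORDERINGS.items():
    print(name, is_monotone(labels))
```

Under the second alternative the highest label belongs to (6,4), whose stopping probability $2p^5q$ vanishes at both p = 0 and p = 1, so that labelling cannot give tail probabilities that are monotone in p even though it can still be used to define formally valid bounds.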

3.  ORDERING  OF  RANDOM  VARIABLES 

We  now  consider  two  different  partial  orderings  of  probability 
distributions  on  the  real  line.  A  brief  outline  of  their  properties  is  given 
below.  Let  Y  and  Z  be  random  variables  with  distribution  functions  G  and  H, 
respectively. Note that, if we define $G^{-1}(u) = \inf\{y: G(y) \ge u\}$, for
$0 < u < 1$, then the distribution of Y can be described by writing $Y = G^{-1}(U)$,
where U is uniformly distributed on [0,1].

Definition 1. We say that Y is stochastically less than Z and write $Y \le_{st} Z$ if
$E(v(Y)) \le E(v(Z))$ for every bounded increasing function v on R.

Proposition 1. $Y \le_{st} Z \Leftrightarrow G(t) \ge H(t)$, $t \in R$. Hence, if $Y \le_{st} Z$, we can write
$Y = G^{-1}(U)$, $Z = H^{-1}(U)$, where U is uniformly distributed on [0,1]. For this
representation of the joint distribution, the inequality $Y \le Z$ always holds.

Suppose further that Y and Z have probability densities g and h, with
respect to a common $\sigma$-finite measure $\mu$ on R.

Definition 2. We say that Y is less than or equal to Z in the sense of
monotone likelihood ratio and write $Y \le_{lr} Z$ if $h(t)/g(t)$ is non-decreasing in
$t \in R$ (excluding t such that $g(t) = h(t) = 0$).

Proposition 2. $Y \le_{lr} Z \Rightarrow Y \le_{st} Z$.

Proofs  of  the  above  results  can  be  found  in  Lehmann’s  paper  and  there  is  a 
clear  exposition,  for  discrete  random  variables,  in  the  paper  by  Whitt  (1979). 
This also introduces the notion of uniform conditional stochastic order
(u.c.s.o.),  which  is  investigated  more  generally  in  Whitt  (1980).  For  our 
purposes,  it  will  be  enough  to  note  one  of  the  results  from  the  last  paper. 

For distributions with probability densities on the real line, u.c.s.o. is
equivalent to m.l.r. in the following sense. Let $B \subset R$ be a Borel set and
consider the probability distributions of $Y_B$ and $Z_B$ obtained by conditioning Y
and Z, respectively, on the event B.

Proposition 3. $Y_B \le_{st} Z_B$ for every event B $\Leftrightarrow$ $Y \le_{lr} Z$.

There is another way of seeing that ordering by m.l.r. is stronger than
stochastic ordering. Suppose that we have an ordered sequence of random
variables $Y_1, Y_2, \ldots, Y_k$ and that they have probability densities $g_1, g_2, \ldots, g_k$
with respect to $\mu$. Suppose further that they are independent of one another
and consider whether the most likely ordering of their observed values is
$Y_1 \le Y_2 \le \ldots \le Y_k$.

Proposition 4. (i) Let $Y_1, Y_2, \ldots, Y_k$ be independent random variables and suppose
that $Y_1 \le_{lr} Y_2 \le_{lr} \ldots \le_{lr} Y_k$. Let $(\pi_1, \pi_2, \ldots, \pi_k)$ be any permutation of the
integers $1, 2, \ldots, k$. Then

$P(Y_1 \le Y_2 \le \ldots \le Y_k) \ge P(Y_{\pi_1} \le Y_{\pi_2} \le \ldots \le Y_{\pi_k})$.

(ii) This property does not hold, in general, for $k \ge 3$ independent random
variables such that $Y_1 \le_{st} Y_2 \le_{st} \ldots \le_{st} Y_k$.


Proof. We shall establish Part (i) by showing that

$g_1(y_1) g_2(y_2) \ldots g_k(y_k) \ge g_{\pi_1}(y_1) g_{\pi_2}(y_2) \ldots g_{\pi_k}(y_k)$  (8)

holds at every point of $C_k = \{(y_1, y_2, \ldots, y_k): y_1 \le y_2 \le \ldots \le y_k\}$. The required
result will then follow by integrating over the set $C_k$.

Note first that $g_1(y_1) g_2(y_2) \ge g_2(y_1) g_1(y_2)$ if $y_1 \le y_2$, so (8) holds for
$k = 2$. Now let $k \ge 3$ and assume that

$g_1(y_1) g_2(y_2) \ldots g_{k-1}(y_{k-1}) \ge g_{\sigma_1}(y_1) g_{\sigma_2}(y_2) \ldots g_{\sigma_{k-1}}(y_{k-1})$  (9)

holds in $C_{k-1}$ for any permutation $\sigma$ of $1, 2, \ldots, k-1$. If $\pi_k = k$, then (8) is a
trivial consequence of (9), so we may assume that, for some $j < k$, $\pi_j = k$ and
$\pi_k < k$. We define $\sigma$ in (9) by $\sigma_i = \pi_i$ if $i \ne j$, $i \le k-1$, and $\sigma_j = \pi_k$. It is now
a straightforward matter to deduce (8), by using the fact that, since $y_j \le y_k$,

$g_k(y_k) \ge g_k(y_j) g_{\pi_k}(y_k) / g_{\pi_k}(y_j) = g_{\pi_j}(y_j) g_{\pi_k}(y_k) / g_{\sigma_j}(y_j)$.

The proof of Part (ii) is based on a counter-example. Let $U_1, U_2, U_3$ be
independent and uniformly distributed on [-1,1]. Define $Y_1 = \min(U_1, 0)$, $Y_2 = U_2$,
$Y_3 = \max(U_3, 0)$. Proposition 1 can be used to show that $Y_1 \le_{st} Y_2 \le_{st} Y_3$. On
the other hand,
$P(Y_1 \le Y_2 \le Y_3) = \frac{1}{4}$ but $P(Y_2 \le Y_1 \le Y_3) = P(Y_1 \le Y_3 \le Y_2) = \frac{3}{8}$.

It is easily shown that the result of Part (i) remains true for stochastic
ordering in the case $k = 2$. However, it does not hold for $k \ge 4$ and this can be
demonstrated by extending the above example.
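The counter-example for Part (ii) can be explored by simulation. The following sketch assumes the construction $Y_1 = \min(U_1, 0)$, $Y_2 = U_2$, $Y_3 = \max(U_3, 0)$ with independent uniform variables on [-1,1]; the function name, the number of trials and the seed are arbitrary illustrative choices. A direct calculation for this construction gives probabilities 1/4, 3/8 and 3/8 for the three arrangements counted below, and the simulated frequencies should be close to these values.

```python
import random


def simulate_orderings(n_trials: int = 200_000, seed: int = 0):
    """Estimate the probabilities of three arrangements of (Y1, Y2, Y3)."""
    rng = random.Random(seed)
    counts = {"Y1<=Y2<=Y3": 0, "Y2<=Y1<=Y3": 0, "Y1<=Y3<=Y2": 0}
    for _ in range(n_trials):
        u1 = rng.uniform(-1.0, 1.0)
        u2 = rng.uniform(-1.0, 1.0)
        u3 = rng.uniform(-1.0, 1.0)
        # Stochastically ordered, but not ordered by monotone likelihood ratio
        y1, y2, y3 = min(u1, 0.0), u2, max(u3, 0.0)
        if y1 <= y2 <= y3:
            counts["Y1<=Y2<=Y3"] += 1
        if y2 <= y1 <= y3:
            counts["Y2<=Y1<=Y3"] += 1
        if y1 <= y3 <= y2:
            counts["Y1<=Y3<=Y2"] += 1
    return {name: c / n_trials for name, c in counts.items()}


frequencies = simulate_orderings()
for name, value in frequencies.items():
    print(name, round(value, 3))
```

The arrangement that agrees with the stochastic ordering is therefore not the most likely one.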

4.  ORDERING  OF  STOPPED  SEQUENCES 

We now return to the random walk model described earlier, with $S_n = \sum_{i=1}^{n} x_i$,
$n \ge 1$, where the steps $x_i$ are generated by independent observations from the
distribution defined by (3). We restrict attention to stopping times specified
by two sequences of numbers. Let $-\infty \le a_n \le b_n \le \infty$ for $n = 1, 2, \ldots$ and let

$N = \min\{n \ge 1: S_n \le a_n \text{ or } S_n \ge b_n\}$.  (10)

Clearly, $N \le m = \min\{n \ge 1: a_n = b_n\}$, if this is finite. If $a_n < b_n$ for all n,
we set $m = \infty$ and the stopping time N may be infinite. However, it is assumed
in such cases that N is finite with probability 1, for any value of the
parameter $\theta$. This means that the stopping point $(N, S_N)$ always has a proper
distribution.

The stopping region associated with N consists of points (n,s) such that
$1 \le n \le m$ and either $s \le a_n$ or $s \ge b_n$. This can be regarded as a totally
ordered set.




Definition 3. Let (n,s) and (n',s') be points of the stopping region. We say
that (n',s') is above (n,s) and write $(n',s') \succeq (n,s)$ if one of the following
conditions holds:

(i) $n' = n$ and $s' \ge s$,
(ii) $n' < n$ and $s' \ge b_{n'}$,
(iii) $n' > n$ and $s \le a_n$.

For example, it is a straightforward matter to check that either
$(n',s') \succeq (n,s)$ or $(n,s) \succeq (n',s')$, for every pair of stopping points, and that
the relation $\succeq$ is transitive.

We are now in a position to prove the main result. Let

$\phi(n,s,\theta) = P_\theta((N, S_N) \succeq (n,s))$.  (11)

Theorem. Under the above conditions, the function $\phi$ is non-decreasing in $\theta$:

$\phi(n,s,\theta') \ge \phi(n,s,\theta)$

whenever $\theta' > \theta$, for any point (n,s) in the stopping region.

Proof. We shall couple together two realisations of the random walk,
corresponding to the parameter values $\theta$ and $\theta'$. It follows from (3) that the
likelihood ratio for a single observation x is $\exp\{(\theta'-\theta)x - (\psi(\theta')-\psi(\theta))\}$ and
this is increasing in x if $\theta' > \theta$. Hence, we can associate random variables X
and X' with $\theta$ and $\theta'$, respectively, such that $X \le_{lr} X'$. Let $F(x;\theta)$ be the
distribution function determined by (3). Then, according to Propositions 1 and
2, we can describe the two distributions by writing $X = F^{-1}(U;\theta)$ and
$X' = F^{-1}(U;\theta')$, where U is uniformly distributed on [0,1]. Now let $u_1, u_2, \ldots$ be
independent observations from the uniform distribution and consider the
realisations generated by setting $S_n = \sum_{i=1}^{n} x_i$, $S'_n = \sum_{i=1}^{n} x'_i$, where $x_i = F^{-1}(u_i;\theta)$,
$x'_i = F^{-1}(u_i;\theta')$, and hence $x_i \le x'_i$ always holds. We must compare the stopping
points associated with $\theta$ and $\theta'$. Given the sequence $(u_1, u_2, \ldots)$, we can apply
(10) to determine points $(N, S_N)$ and $(N', S'_{N'})$, say. Then it follows from
Definition 3 and the fact that $S_n \le S'_n$ for all $n \ge 1$, that $(N', S'_{N'}) \succeq (N, S_N)$.
Thus, we have generated the stopping points from independent uniform random
variables in such a way that the event $[(N, S_N) \succeq (n,s)]$ is contained in the
event $[(N', S'_{N'}) \succeq (n,s)]$. The theorem follows immediately.


Remark. The above argument can also be used to show that, for any bounded
increasing function v defined on the stopping region, $E(v(N, S_N)) \le E(v(N', S'_{N'}))$
if $\theta < \theta'$. Hence, the random point $(N, S_N)$ is stochastically increasing in $\theta$,
in the sense of Definition 1.
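The coupling used in the proof can be imitated for the Bernoulli example of Section 2. The sketch below is an illustration under stated assumptions: Rule 2 is encoded by boundary sequences with $a_2 = -2$, $b_2 = 2$, no stopping at n = 1, 3, 4, 5 and $a_6 = b_6 = 0$ (any common value would force a stop at n = 6), steps are generated by the quantile transform from shared uniform variables, and `is_above` encodes the ordering of stopping points with earlier points on the upper boundary ranked higher. All names are hypothetical.

```python
import math
import random

INF = math.inf
# Boundary sequences a_n <= b_n encoding Rule 2 (an assumed encoding).
A = {1: -INF, 2: -2, 3: -INF, 4: -INF, 5: -INF, 6: 0}
B = {1: INF, 2: 2, 3: INF, 4: INF, 5: INF, 6: 0}


def stopping_point(steps):
    """First (n, S_n) with S_n <= a_n or S_n >= b_n."""
    s = 0
    for n, x in enumerate(steps, start=1):
        s += x
        if s <= A[n] or s >= B[n]:
            return n, s
    raise ValueError("sequence too short to reach the stopping region")


def is_above(upper, lower) -> bool:
    """True if the stopping point `upper` is above `lower` in the boundary ordering."""
    n1, s1 = upper
    n0, s0 = lower
    if n1 == n0:
        return s1 >= s0
    if n1 < n0:
        return s1 >= B[n1]   # earlier point lies on the upper boundary
    return s0 <= A[n0]       # later point, and `lower` lies on the lower boundary


def coupled_stopping_points(p: float, p_prime: float, rng: random.Random):
    """Stopping points for p and p_prime driven by the same uniform variables."""
    us = [rng.random() for _ in range(6)]
    steps = [1 if u > 1.0 - p else -1 for u in us]              # quantile transform
    steps_prime = [1 if u > 1.0 - p_prime else -1 for u in us]
    return stopping_point(steps), stopping_point(steps_prime)


rng = random.Random(1)
pairs = [coupled_stopping_points(0.4, 0.6, rng) for _ in range(2000)]
print(all(is_above(pt_prime, pt) for pt, pt_prime in pairs))
```

Because the steps for the larger parameter are never smaller than those for the smaller one, every coupled pair should satisfy the ordering, which is the pathwise statement behind the theorem.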


5.  ILLUSTRATION 

The monotonicity of the function $\phi(n,s,\theta)$ established in the theorem is
useful in constructing confidence bounds, but it does not mean that such bounds
are optimal. The uniformly most accurate confidence bounds mentioned earlier
are obtained only if the random stopping point $(N, S_N)$ is increasing with
respect to $\theta$ in the stronger sense of m.l.r. This is exceptional: roughly
speaking, fixed samples lead to uniformly most accurate one-sided confidence
intervals, but random stopping times do not.

We can easily see why, by examining likelihood ratios for the exponential
model (3). After n observations, suppose we find that $S_n = s$. The likelihood
ratio for parameter values $\theta < \theta'$ is $\exp\{(\theta'-\theta)s - (\psi(\theta')-\psi(\theta))n\}$. This is
increasing in s, so if the sample size is fixed in advance at n, the m.l.r.
property holds. Now suppose that (n,s) and (n',s') are points of the stopping
region with $n' > n$. The second point yields a higher likelihood ratio if and
only if

$s' - s > (n'-n)\,(\psi(\theta')-\psi(\theta))/(\theta'-\theta)$.

We could extend this comparison of points to produce an ordering relation on
the stopping region but, in general, the relation would depend on our choice of
parameter values, since $\psi''(\theta) > 0$ and the coefficient $(\psi(\theta')-\psi(\theta))/(\theta'-\theta)$ is
not constant. In the case of independent Bernoulli trials, it is not difficult
to devise random stopping times in such a way that the m.l.r. property holds,
but it is worth noting that in the example discussed in Section 2, only Rule 1
with a fixed sample size produces a stopping region that has the m.l.r.
property.

Finally,  consider  a  simple  acceptance/rejection  scheme  based  on  a  process 
in continuous time. Let $S(t) = \theta t + W(t)$ for $t \ge 0$, where $S(0) = 0$ and W(t) is a
standard Wiener process. Suppose that S(t) is a summary of responses in [0,t]
to a new medical treatment and that positive values of the unknown parameter $\theta$
represent a higher risk of serious adverse effects. Let b and m be fixed
positive numbers and let the decision procedure be specified as follows: stop
and reject the treatment as soon as $S(t) = b$ if this occurs for some $t < m$;
accept the new treatment if $S(t) < b$ for $0 \le t \le m$. A detailed evaluation of
this procedure is given in Siegmund's book (1985): see Chapter 3. Here, the
aim is to illustrate some consequences of Propositions 3 and 4.

Since  the  boundary  prevents  any  overshoot,  the  stopping  region  consists  of 
two lines in the (t,s) plane. Strictly speaking, the theorem of Section 4 does
not cover processes in continuous time, but it is easy to verify that, for a
counter-clockwise ordering of the boundary, we have stochastic ordering of the
distributions of the terminal point with respect to $\theta$. For two values $\theta$ and $\theta'$
of the drift parameter, the likelihood ratio at any boundary point (t,s) is

$\exp\{(\theta'-\theta)s - \frac{1}{2}(\theta'^2-\theta^2)t\}$.  (12)

In cases of acceptance, $t = m$ and this is increasing in s provided that $\theta' > \theta$.
Rejected cases occur on the line $s = b$, for $t < m$, and there the likelihood
ratio is increasing in the counter-clockwise direction (i.e. decreasing in t)
if and only if $|\theta'| > |\theta|$.
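The behaviour of (12) on the rejection line can be illustrated with a few parameter pairs. This is a minimal sketch: the function names and the chosen values of b and t are arbitrary, and the check only compares the log likelihood ratio at a handful of times, so it illustrates the condition $|\theta'| > |\theta|$ rather than proving it.

```python
def log_likelihood_ratio(theta: float, theta_prime: float, s: float, t: float) -> float:
    """Logarithm of (12) at the boundary point (t, s)."""
    return (theta_prime - theta) * s - 0.5 * (theta_prime ** 2 - theta ** 2) * t


def increasing_counter_clockwise(theta: float, theta_prime: float,
                                 b: float, times) -> bool:
    """True if the ratio on the line s = b strictly decreases as t increases.

    Counter-clockwise on the rejection line means decreasing t; `times` must be
    sorted in increasing order.
    """
    values = [log_likelihood_ratio(theta, theta_prime, b, t) for t in times]
    return all(later < earlier for earlier, later in zip(values, values[1:]))


b = 1.0
times = [0.1, 0.5, 1.0, 2.0, 4.0]
for theta, theta_prime in [(0.5, 1.0), (-0.5, 1.0), (-2.0, 1.0)]:
    print(theta, theta_prime, increasing_counter_clockwise(theta, theta_prime, b, times))
```

The last pair has $\theta' > \theta$ but $|\theta'| < |\theta|$, so the likelihood ratio on the rejection line increases with t and the counter-clockwise ordering is reversed there.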

Consider the results of applying the scheme independently to k different
treatments and suppose the corresponding drift parameters are in the order:
$\theta_1 > \theta_2 > \ldots > \theta_k$. Intuitively, it might seem that the most likely arrangement of
the corresponding terminal points is $(t_1,s_1) \succeq (t_2,s_2) \succeq \ldots \succeq (t_k,s_k)$, with an
obvious extension of Definition 3. However, this may not be true. Proposition
4 applies if $|\theta_1| \ge |\theta_2| \ge \ldots \ge |\theta_k|$, but let us assume that this last condition
does not hold. We can argue conditionally. If all the treatments are accepted
then, because of the m.l.r. property on the line $t = m$, the most
likely arrangement is $b \ge s_1 \ge s_2 \ge \ldots \ge s_k$. However, after rejection, the
observed arrangement of the final times could be quite misleading. Conditional
on rejection, we have the m.l.r. property based on $|\theta|$, rather than $\theta$, and the
possibility of false rejections (i.e. cases with $\theta_i < 0$) makes the situation
more complicated. The most likely arrangement, given rejection, need not be
the one with $t_1 \le t_2 \le \ldots \le t_k < m$.


More generally, suppose we have a stopping region determined by two smooth
boundary curves: $s = a(t)$ and $s = b(t)$. The process {S(t)} is allowed to
continue so long as $a(t) < S(t) < b(t)$. We can see by using (12) that on the
upper boundary curve, the likelihood ratio is

$\exp\{(\theta'-\theta)(b(t) - \frac{1}{2}(\theta+\theta')t)\}$

and this is increasing in the counter-clockwise direction if $\theta' > \theta$ and if the
derivative $b'(t) < \frac{1}{2}(\theta+\theta')$. In the special case where $b'(t) = 0$, we noted that
negative values of the drift led to complications in relating the order of
parameter values to the order of stopping points on the boundary. Here we can
say roughly that the idea of a counter-clockwise ordering of boundary points
remains valid, conditional on stopping near the point (t,b(t)), provided that
we are concerned with values of the drift leading towards the boundary (i.e.
$\theta > b'(t)$). Similar remarks apply to the lower boundary for values of the
drift $\theta < a'(t)$. It seems that we should try to design stopping rules so that
there is always a high probability that the random process will reach a
stopping point where the expected increments lead towards the boundary, rather
than away from it.

